Spectral methods for voice source parameters estimation
Abstract
A spectral approach is proposed for voice source parameter representation and estimation. Parameter estimation is based on decomposition of the periodic and the aperiodic components of the speech signal, and on spectral modelling of the periodic component. The paper focuses on parameter estimation for the periodic component of the glottal flow. A new anticausal all-pole model of the glottal flow is derived. Glottal flow is seen as an anticausal 2-pole filter followed by a spectral tilt filter. The anticausal filter has complex poles, instead of the real poles that are usually assumed. Time-domain and frequency-domain parameters are linked by analytic formulas. Two spectral-domain algorithms are proposed for estimation of the open quotient. The first one is based on measurement of the first harmonics, and the second one is based on spectral modelling. Experimental results demonstrate the accuracy of the estimation procedures.
1. INTRODUCTION
In this paper we present our recent work on algorithms for automatic analysis of voice quality parameters. These parameters are studied in the spectral domain. Voice source measurements are needed for high-quality speech synthesis, because voice quality is currently a key issue for naturalness. This is particularly true for speech synthesis using concatenation of natural speech segments (e.g. diphones or non-uniform units). Differences in voice source, for instance differences in vocal effort, are often perceived by listeners as synthesis or concatenation errors, because of the change in quality across segments. Unlike most recent work on source modelling, we preferred spectral processing, because it has a number of advantages:
Similar resources
Using Context-based Statistical Models to Promote the Quality of Voice Conversion Systems
This article examines methods for optimizing the performance of GMM-based voice conversion systems, taking the GMM method as the baseline for improving voice conversion performance. In current methods, because a single conversion function is used to convert all speech units, and because of the subsequent spectral smoothing that arises from statistical averaging, we will observe quality ...
Automatic parametrization of voice source signals: a novel evaluation procedure is used to compare methods and test the effect of low-pass filtering
There is a need for automatic methods for parametrization of the voice source signals. Representatives of the two types of methods that have been used most often for parametrization were tested and compared. For this purpose a novel evaluation procedure is proposed which makes it possible to perform the numerous tests needed for a detailed comparison of the methods. This evaluation procedure re...
Probability models of formant parameters for voice conversion
This paper explores the estimation and mapping of probability models of formant parameter vectors for voice conversion. The formant parameter vectors consist of the frequency, bandwidth and intensity of resonance at formants. Formant parameters are derived from the coefficients of a linear prediction (LP) model of speech. The formant distributions are modelled with phoneme-dependent two-dimensio...
A Novel Source Analysis Method by Matching Spectral Characters of LF Model with STRAIGHT Spectrum
This paper presents a voice source analysis method by studying the spectral characters of LF model and their representation in output speech signal. The estimation of source features is defined as the set of LF parameter whose spectrum has the most similar characters in frequency domain, including glottal formant and spectral tilt, with the corresponding characters held by the STRAIGHT spectrum...
Mixed source model and its adapted vocal tract filter estimate for voice transformation and synthesis
In current methods for voice transformation and speech synthesis, the vocal tract filter is usually assumed to be excited by a flat amplitude spectrum. In this article, we present a method using a mixed source model defined as a mixture of the Liljencrants–Fant (LF) model and Gaussian noise. Using the LF model, the base approach used in this presented work is therefore close to a vocoder using ...
Publication date: 1997